Abstract

Consider a convex relaxation $\hat f$ of a pseudo-boolean function $f$. We say that the relaxation is totally half-integral if $\hat f(x)$ is a polyhedral function with half-integral extreme points $x$, and this property is preserved after adding an arbitrary combination of constraints of the form $x_i = x_j$, $x_i = 1 - x_j$, and $x_i = \gamma$ where $\gamma \in \{0, 1, \frac12\}$ is a constant. A well-known example is the roof duality relaxation for quadratic pseudo-boolean functions $f$. We argue that total half-integrality is a natural requirement for generalizations of roof duality to arbitrary pseudo-boolean functions.

Our contributions are as follows. First, we provide a complete characterization of totally half-integral relaxations $\hat f$ by establishing a one-to-one correspondence with bisubmodular functions. Second, we give a new characterization of bisubmodular functions. Finally, we show some relationships between general totally half-integral relaxations and relaxations based on the roof duality.

1 Introduction

Let $V$ be a set of $|V| = n$ nodes and $\mathbb{B} \subset \mathcal{K}^{1/2} \subset \mathcal{K}$ be the following sets:

$$\mathbb{B} = \{0,1\}^V \qquad \mathcal{K}^{1/2} = \{0,\tfrac12,1\}^V \qquad \mathcal{K} = [0,1]^V$$

A function $f : \mathbb{B} \to \mathbb{R}$ is called pseudo-boolean. In this paper we consider convex relaxations $\hat f : \mathcal{K} \to \mathbb{R}$ of $f$ which we call totally half-integral:

Definition 1. (a) Function $\hat f : \mathcal{P} \to \mathbb{R}$ where $\mathcal{P} \subseteq \mathcal{K}$ is called half-integral if it is a convex polyhedral function such that all extreme points of the epigraph $\{(x, z) \mid x \in \mathcal{P},\ z \ge \hat f(x)\}$ have the form $(x, \hat f(x))$ where $x \in \mathcal{K}^{1/2}$. (b) Function $\hat f : \mathcal{K} \to \mathbb{R}$ is called totally half-integral if restrictions $\hat f : \mathcal{P} \to \mathbb{R}$ are half-integral for all subsets $\mathcal{P} \subseteq \mathcal{K}$ obtained from $\mathcal{K}$ by adding an arbitrary combination of constraints of the form $x_i = x_j$, $x_i = \bar x_j$, and $x_i = \gamma$ for points $x \in \mathcal{K}$. Here $i, j$ denote nodes in $V$, $\gamma$ denotes a constant in $\{0, 1, \frac12\}$, and $\bar z = 1 - z$.

A well-known example of a totally half-integral relaxation is the roof duality relaxation for quadratic pseudo-boolean functions $f(x) = \sum_i c_i x_i + \sum_{(i,j)} c_{ij} x_i x_j$ studied by Hammer, Hansen and Simeone [13]. It is known to possess the persistency property: for any half-integral minimizer $\hat x \in \arg\min \hat f(\hat x)$ there exists a minimizer $x \in \arg\min f(x)$ such that $x_i = \hat x_i$ for all nodes $i$ with integral component $\hat x_i$. This property is quite important in practice, as it allows one to reduce the size of the minimization problem when $\hat x \ne \mathbf{\frac12}$. The set of nodes with a guaranteed optimal solution can sometimes be increased further using the PROBE technique [6], which also relies on persistency.

The goal of this paper is to generalize the roof duality approach to arbitrary pseudo-boolean functions. Total half-integrality is a very natural requirement of such generalizations, as discussed later in this section. As we prove, total half-integrality implies persistency.

We provide a complete characterization of totally half-integral relaxations. Namely, we prove in section 2 that if $\hat f : \mathcal{K} \to \mathbb{R}$ is totally half-integral then its restriction to $\mathcal{K}^{1/2}$ is a bisubmodular function, and conversely any bisubmodular function can be extended to a totally half-integral relaxation.

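Definition 2. Function $f : \mathcal{K}^{1/2} \to \mathbb{R}$ is called bisubmodular if

$$f(x \sqcap y) + f(x \sqcup y) \le f(x) + f(y) \qquad \forall x, y \in \mathcal{K}^{1/2} \tag{1}$$

where binary operators $\sqcap, \sqcup : \mathcal{K}^{1/2} \times \mathcal{K}^{1/2} \to \mathcal{K}^{1/2}$ are defined component-wise as follows:

$$
\begin{array}{c|ccc}
\sqcap & 0 & \tfrac12 & 1 \\ \hline
0 & 0 & \tfrac12 & \tfrac12 \\
\tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 \\
1 & \tfrac12 & \tfrac12 & 1
\end{array}
\qquad\qquad
\begin{array}{c|ccc}
\sqcup & 0 & \tfrac12 & 1 \\ \hline
0 & 0 & 0 & \tfrac12 \\
\tfrac12 & 0 & \tfrac12 & 1 \\
1 & \tfrac12 & 1 & 1
\end{array}
\tag{2}
$$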
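Inequality (1) can be checked by brute force on small instances. The sketch below is only an illustration: it assumes the standard component-wise operations for bisubmodular functions (the meet keeps a label only where both arguments agree, the join is the sign of the sum in $\{-1,0,1\}$ coordinates), and the functions `f_good` and `f_bad` are hypothetical examples that do not appear in the paper.

```python
from itertools import product

HALF = 0.5
LABELS = (0, HALF, 1)

def meet(a, b):
    """Component-wise meet: keep a label only where both arguments agree."""
    return a if a == b else HALF

def join(a, b):
    """Component-wise join: sign of the sum in {-1, 0, 1} coordinates."""
    s = (2 * a - 1) + (2 * b - 1)
    return ((s > 0) - (s < 0) + 1) / 2

def is_bisubmodular(f, n, tol=1e-9):
    """Brute-force test of inequality (1) over all pairs of labelings."""
    points = list(product(LABELS, repeat=n))
    for x in points:
        for y in points:
            lo = tuple(meet(a, b) for a, b in zip(x, y))
            hi = tuple(join(a, b) for a, b in zip(x, y))
            if f(lo) + f(hi) > f(x) + f(y) + tol:
                return False
    return True

def f_good(x):
    # Hypothetical example: penalties for x0 != x1 and x1 != 1 - x2, plus a linear term.
    return abs(x[0] - x[1]) + abs(x[1] + x[2] - 1) + 0.3 * x[0]

def f_bad(x):
    # Hypothetical example that rewards disagreement; it violates (1).
    return -abs(x[0] - x[1])

print(is_bisubmodular(f_good, 3), is_bisubmodular(f_bad, 2))
```

The check enumerates all $3^n \times 3^n$ pairs, so it is only practical for very small $n$; it is meant as a sanity test for candidate terms, not as a minimization method.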



As our second contribution, we give a new characterization of bisubmodular functions (section 3). Using this characterization, we then prove several results showing links with the roof duality relaxation (section 4).

1.1 Applications 

This work has been motivated by computer vision applications. A fundamental task in vision is to infer 
pixel properties from observed data. These properties can be the type of object to which the pixel belongs, 
distance to the camera, pixel intensity before being corrupted by noise, etc. The popular MAP-MRF approach casts the inference task as an energy minimization problem with the objective function of the form $f(x) = \sum_C f_C(x)$ where $C \subset V$ are subsets of neighboring pixels of small cardinality ($|C| = 1, 2, 3, \ldots$) and terms $f_C(x)$ depend only on labels of pixels in $C$.

For some vision applications the roof duality approach [13] has shown good performance [29, 31, 22, 23, 32, 1, 16, 17]. Functions with higher-order terms are steadily gaining popularity in computer vision [30, 32, 1, 16, 17]; it is generally accepted that they correspond to better image models. Therefore, studying generalizations of roof duality to arbitrary pseudo-boolean functions is an important task. In such generalizations the total half-integrality property is essential. Indeed, in practice, the relaxation $\hat f$ is obtained as the sum of relaxations $\hat f_C$ constructed for each term independently. Some of these terms can be $c|x_i - x_j|$ and $c|x_i + x_j - 1|$. If $c$ is sufficiently large, then applying the roof duality relaxation to these terms would yield the constraints $x_i = x_j$ and $x_i = \bar x_j$ present in the definition of total half-integrality. Constraints $x_i = \gamma \in \{0, 1, \frac12\}$ can also be simulated via the roof duality, e.g. $x_i = x_j$, $x_i = \bar x_j$ for the same pair of nodes $i, j$ implies $x_i = x_j = \frac12$.

1.2 Related work 

Half-integrality There is a vast literature on using half-integral relaxations for various combinatorial 
optimization problems. In many cases these relaxations lead to 2-approximation algorithms. Below we list 
a few representative papers. 

The earliest work recognizing half-integrality of polytopes with certain pairwise constraints was perhaps by Balinski [3], while the persistency property goes back to Nemhauser and Trotter [27] who considered the vertex cover problem. Hammer, Hansen and Simeone [13] established that these properties hold for the roof duality relaxation for quadratic pseudo-boolean functions. Their work was generalized to arbitrary pseudo-boolean functions by Lu and Williams [24]. (The relaxation in [24] relied on converting function $f$ to a multinomial representation; see section 4 for more details.) Hochbaum [14, 15] gave a class of integer problems with half-integral relaxations. Very recently, Iwata and Nagano [18] formulated a half-integral relaxation for the problem of minimizing a submodular function $f(x)$ under constraints of the form $x_i + x_j \ge 1$.

1 In many vision problems variables $x_i$ are not binary. However, such problems are often reduced to a sequence of binary minimization problems using iterative move-making algorithms, e.g. using expansion moves [9] or fusion moves [22, 23, 32, 17].

In computer vision, several researchers considered the following scheme: given a function $f(x) = \sum_C f_C(x)$, convert terms $f_C(x)$ to quadratic pseudo-boolean functions by introducing auxiliary binary variables, and then apply the roof duality relaxation to the latter. Woodford et al. [32] used this technique for the stereo reconstruction problem, while Ali et al. [1] and Ishikawa [16] explored different conversions to quadratic functions.

To the best of our knowledge, all examples of totally half-integral relaxations proposed so far belong to the class of submodular relaxations, which is defined in section 4. They form a subclass of more general bisubmodular relaxations.

Bisubmodularity Bisubmodular functions were introduced by Chandrasekaran and Kabadi as rank functions of (poly-)pseudomatroids [10, 19]. Independently, Bouchet [7] introduced the concept of $\Delta$-matroids which is equivalent to pseudomatroids. Bisubmodular functions and their generalizations have also been considered by Qi [28], Nakamura [26], Bouchet and Cunningham [8] and Fujishige [11]. The notion of the Lovász extension of a bisubmodular function introduced by Qi [28] will be of particular importance for our work (see next section).

It has been shown that some submodular minimization algorithms can be generalized to bisubmodular functions. Qi [28] showed the applicability of the ellipsoid method. A weakly polynomial combinatorial algorithm for minimizing bisubmodular functions was given by Fujishige and Iwata [12], and a strongly polynomial version was given by McCormick and Fujishige [25].

Recently, we introduced strongly and weakly tree-submodular functions [21] that generalize bisubmod- 
ular functions. 

2 Total half-integrality and bisubmodularity 

The first result of this paper is the following theorem.

Theorem 3. If $\hat f : \mathcal{K} \to \mathbb{R}$ is a totally half-integral relaxation then its restriction to $\mathcal{K}^{1/2}$ is bisubmodular. Conversely, if function $f : \mathcal{K}^{1/2} \to \mathbb{R}$ is bisubmodular then it has a unique totally half-integral extension $\hat f : \mathcal{K} \to \mathbb{R}$.

This section is devoted to the proof of theorem 3. Denote $\mathcal{L} = [-1, 1]^V$, $\mathcal{L}^{1/2} = \{-1, 0, 1\}^V$. It will be convenient to work with functions $\hat h : \mathcal{L} \to \mathbb{R}$ and $h : \mathcal{L}^{1/2} \to \mathbb{R}$ obtained from $\hat f$ and $f$ via a linear change of coordinates $x_i \mapsto 2x_i - 1$. Under this change totally half-integral relaxations are transformed to totally integral relaxations:

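Definition 4. Let $\hat h : \mathcal{L} \to \mathbb{R}$ be a function of $n$ variables. (a) $\hat h$ is called integral if it is a convex polyhedral function such that all extreme points of the epigraph $\{(x, z) \mid x \in \mathcal{L},\ z \ge \hat h(x)\}$ have the form $(x, \hat h(x))$ where $x \in \mathcal{L}^{1/2}$. (b) $\hat h$ is called totally integral if it is integral and for an arbitrary ordering of nodes the following functions of $n - 1$ variables (if $n > 1$) are totally integral:

$$\hat h'(x_1, \ldots, x_{n-1}) = \hat h(x_1, \ldots, x_{n-1}, x_{n-1})$$
$$\hat h'(x_1, \ldots, x_{n-1}) = \hat h(x_1, \ldots, x_{n-1}, -x_{n-1})$$
$$\hat h'(x_1, \ldots, x_{n-1}) = \hat h(x_1, \ldots, x_{n-1}, \gamma) \quad \text{for any constant } \gamma \in \{-1, 0, 1\}$$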
The definition of a bisubmodular function is adapted as follows: function $h : \mathcal{L}^{1/2} \to \mathbb{R}$ is bisubmodular if inequality (1) holds for all $x, y \in \mathcal{L}^{1/2}$ where operations $\sqcap, \sqcup$ are defined by tables (2) after replacements $0 \mapsto -1$, $\frac12 \mapsto 0$, $1 \mapsto 1$. To prove theorem 3, it suffices to establish a link between totally integral relaxations $\hat h : \mathcal{L} \to \mathbb{R}$ and bisubmodular functions $h : \mathcal{L}^{1/2} \to \mathbb{R}$. We can assume without loss of generality that $\hat h(\mathbf 0) = h(\mathbf 0) = 0$, since adding a constant to the functions does not affect the theorem.

A pair $\omega = (\pi, \sigma)$ where $\pi : V \to \{1, \ldots, n\}$ is a permutation of $V$ and $\sigma \in \{-1, 1\}^V$ will be called a signed ordering. Let us rename nodes in $V$ so that $\pi(i) = i$. To each signed ordering $\omega$ we associate labelings $x^0, x^1, \ldots, x^n \in \mathcal{L}^{1/2}$ as follows:

$$x^0 = (0, 0, \ldots, 0) \qquad x^1 = (\sigma_1, 0, \ldots, 0) \qquad \ldots \qquad x^n = (\sigma_1, \sigma_2, \ldots, \sigma_n) \tag{3}$$

where nodes are ordered according to $\pi$.

Consider function $h : \mathcal{L}^{1/2} \to \mathbb{R}$ with $h(\mathbf 0) = 0$. Its Lovász extension $\hat h : \mathbb{R}^V \to \mathbb{R}$ is defined in the following way [28]. Given a vector $x \in \mathbb{R}^V$, select a signed ordering $\omega = (\pi, \sigma)$ as follows: (i) choose $\pi$ so that values $|x_i|$, $i \in V$ are non-increasing, and rename nodes accordingly so that $|x_1| \ge \ldots \ge |x_n|$; (ii) if $x_i \ne 0$ set $\sigma_i = \mathrm{sign}(x_i)$, otherwise choose $\sigma_i \in \{-1, 1\}$ arbitrarily. It is not difficult to check that

$$x = \sum_{i=1}^{n} \lambda_i x^i \tag{4a}$$

where labelings $x^i$ are defined in (3) (with respect to the selected signed ordering) and $\lambda_i = |x_i| - |x_{i+1}|$ for $i = 1, \ldots, n-1$, $\lambda_n = |x_n|$. The value of the Lovász extension is now defined as

$$\hat h(x) = \sum_{i=1}^{n} \lambda_i h(x^i) \tag{4b}$$
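The construction in (4a) and (4b) translates directly into a short procedure. The following sketch is an illustration under stated assumptions: labelings are tuples over $\{-1,0,1\}$, the sign for a zero component is fixed to $+1$ (any choice is allowed), and `h_example` is a hypothetical function with $h(\mathbf 0) = 0$ that is not taken from the paper.

```python
def lovasz_extension(h, x):
    """Evaluate the Lovasz extension of h (with h(0) = 0) at x in [-1, 1]^n."""
    n = len(x)
    # (i) order nodes so that |x_i| is non-increasing
    order = sorted(range(n), key=lambda i: -abs(x[i]))
    # (ii) sigma_i = sign(x_i); for x_i = 0 the choice is arbitrary (+1 here)
    sigma = [1 if x[i] >= 0 else -1 for i in range(n)]
    label = [0] * n
    value = 0.0
    for rank, i in enumerate(order):
        label[i] = sigma[i]                      # next labeling from (3)
        nxt = abs(x[order[rank + 1]]) if rank + 1 < n else 0.0
        lam = abs(x[i]) - nxt                    # coefficient lambda in (4a)
        value += lam * h(tuple(label))           # accumulate the sum (4b)
    return value

def h_example(x):
    # Hypothetical function on {-1, 0, 1}^3; it is linear on every simplex
    # of a signed ordering, so the extension reproduces the same formula.
    return abs(x[0] - x[1]) + abs(x[1] + x[2])

print(lovasz_extension(h_example, (0.5, -0.25, 0.0)))
```

For this particular `h_example` the extension coincides with the same expression evaluated at real arguments, which gives a convenient way to test the routine.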

Theorem 5 ([28]). Function $h$ is bisubmodular if and only if its Lovász extension $\hat h$ is convex on $\mathcal{L}$.$^2$

Let $\mathcal{L}_\omega$ be the set of vectors in $\mathcal{L}$ for which signed ordering $\omega = (\pi, \sigma)$ can be selected. Clearly, $\mathcal{L}_\omega = \{x \in \mathcal{L} \mid |x_1| \ge \ldots \ge |x_n|,\ x_i \sigma_i \ge 0\ \forall i \in V\}$. It is easy to check that $\mathcal{L}_\omega$ is the convex hull of the $n + 1$ points (3). Equations (4) imply that $\hat h$ is linear on $\mathcal{L}_\omega$ and coincides with $h$ in each corner $x^0, \ldots, x^n$.

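Lemma 6. Suppose function $\hat h : \mathcal{L} \to \mathbb{R}$ is totally integral. Then $\hat h$ is linear on simplex $\mathcal{L}_\omega$ for each signed ordering $\omega = (\pi, \sigma)$.

Proof. We use induction on $n = |V|$. For $n = 1$ the claim is straightforward; suppose that $n \ge 2$. Consider signed ordering $\omega = (\pi, \sigma)$. We need to prove that $\hat h$ is linear on the boundary $\partial\mathcal{L}_\omega$; this will imply that $\hat h$ is linear on $\mathcal{L}_\omega$ since otherwise $\hat h$ would have an extreme point in the interior $\mathcal{L}_\omega \setminus \partial\mathcal{L}_\omega$ which cannot be integral.

Let $X = \{x^0, \ldots, x^n\}$ be the set of extreme points of $\mathcal{L}_\omega$ defined by (3). The boundary $\partial\mathcal{L}_\omega$ is the union of $n + 1$ facets $\mathcal{L}^0_\omega, \ldots, \mathcal{L}^n_\omega$ where $\mathcal{L}^i_\omega$ is the convex hull of points in $X \setminus \{x^i\}$. Let us prove that $\hat h$ is linear on $\mathcal{L}^0_\omega$. All points $x \in X \setminus \{x^0\}$ satisfy $x_1 = \sigma_1$, therefore $\mathcal{L}^0_\omega = \{x \in \mathcal{L}_\omega \mid x_1 = \sigma_1\}$. Consider the function of $n - 1$ variables $\hat h'(x_2, \ldots, x_n) = \hat h(\sigma_1, x_2, \ldots, x_n)$, and let $\mathcal{L}'^0_\omega$ be the projection of $\mathcal{L}^0_\omega$ to $\mathbb{R}^{V \setminus \{1\}}$. By the induction hypothesis $\hat h'$ is linear on $\mathcal{L}'^0_\omega$, and thus $\hat h$ is linear on $\mathcal{L}^0_\omega$.

The fact that $\hat h$ is linear on the other facets can be proved in a similar way. Note that for $i = 1, \ldots, n - 1$ there holds $\mathcal{L}^i_\omega = \{x \in \mathcal{L}_\omega \mid x_i = \sigma_i \sigma_{i+1} x_{i+1}\}$ (the facet on which $\lambda_i = |x_i| - |x_{i+1}|$ vanishes), and for $i = n$ we have $\mathcal{L}^n_\omega = \{x \in \mathcal{L}_\omega \mid x_n = 0\}$.

□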



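2 Note, Qi formulates this result slightly differently: $\hat h$ is assumed to be convex on $\mathbb{R}^V$ rather than on $\mathcal{L}$. However, it is easy to see that convexity of $\hat h$ on $\mathcal{L}$ implies convexity of $\hat h$ on $\mathbb{R}^V$. Indeed, it can be checked that $\hat h$ is positively homogeneous, i.e. $\hat h(\gamma x) = \gamma \hat h(x)$ for any $\gamma \ge 0$, $x \in \mathbb{R}^V$. Therefore, for any $x, y \in \mathbb{R}^V$ and $\alpha, \beta \ge 0$ with $\alpha + \beta = 1$ there holds

$$\hat h(\alpha x + \beta y) = \frac{1}{\gamma}\hat h(\alpha\gamma x + \beta\gamma y) \le \frac{\alpha}{\gamma}\hat h(\gamma x) + \frac{\beta}{\gamma}\hat h(\gamma y) = \alpha \hat h(x) + \beta \hat h(y)$$

where the inequality in the middle follows from convexity of $\hat h$ on $\mathcal{L}$, assuming that $\gamma > 0$ is a sufficiently small constant.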



Corollary 7. Suppose function $\hat h : \mathcal{L} \to \mathbb{R}$ with $\hat h(\mathbf 0) = 0$ is totally integral. Let $h$ be the restriction of $\hat h$ to $\mathcal{L}^{1/2}$ and $\tilde h$ be the Lovász extension of $h$. Then $\hat h$ and $\tilde h$ coincide on $\mathcal{L}$.

Theorem 5 and corollary 7 imply the first part of theorem 3. The second part will follow from

Lemma 8. If $h : \mathcal{L}^{1/2} \to \mathbb{R}$ with $h(\mathbf 0) = 0$ is bisubmodular then its Lovász extension $\hat h : \mathcal{L} \to \mathbb{R}$ is totally integral.

Proof. We use induction on $n = |V|$. For $n = 1$ the claim is straightforward; suppose that $n \ge 2$. By theorem 5, $\hat h$ is convex on $\mathcal{L}$. Function $\hat h$ is integral since it is linear on each simplex $\mathcal{L}_\omega$ and vertices of $\mathcal{L}_\omega$ belong to $\mathcal{L}^{1/2}$. It remains to show that functions $\hat h'$ considered in definition 4 are totally integral. Consider the following functions $h' : \{-1, 0, 1\}^{V \setminus \{n\}} \to \mathbb{R}$:

$$h'(x_1, \ldots, x_{n-1}) = h(x_1, \ldots, x_{n-1}, x_{n-1})$$
$$h'(x_1, \ldots, x_{n-1}) = h(x_1, \ldots, x_{n-1}, -x_{n-1})$$
$$h'(x_1, \ldots, x_{n-1}) = h(x_1, \ldots, x_{n-1}, \gamma), \quad \gamma \in \{-1, 0, 1\}$$

It can be checked that these functions are bisubmodular, and their Lovász extensions coincide with the respective functions $\hat h'$ used in definition 4. The claim now follows from the induction hypothesis. □

3 A new characterization of bisubmodularity 

In this section we give an alternative definition of bisubmodularity; it will be helpful later for describing a relationship to the roof duality. As is often done for bisubmodular functions, we will encode each half-integral value $x_i \in \{0, 1, \frac12\}$ via two binary variables $(u_i, u_{i'})$ according to the following rules:

$$0 \leftrightarrow (0, 1) \qquad 1 \leftrightarrow (1, 0) \qquad \tfrac12 \leftrightarrow (0, 0)$$

Thus, labelings in $\mathcal{K}^{1/2}$ will be represented via labelings in the set

$$\mathcal{X}^- = \{u \in \{0, 1\}^{\bar V} \mid (u_i, u_{i'}) \ne (1, 1)\ \forall i \in V\}$$

where $\bar V = \{i, i' \mid i \in V\}$ is a set with $2n$ nodes. The node $i'$ for $i \in V$ is called the "mate" of $i$; intuitively, variable $u_{i'}$ corresponds to the complement of $u_i$. We define $(i')' = i$ for $i \in V$. Labelings in $\mathcal{X}^-$ will be denoted either by a single letter, e.g. $u$ or $v$, or by a pair of letters, e.g. $(x, y)$. In the latter case we assume that the two components correspond to labelings of $V$ and $\bar V \setminus V$, respectively, and the order of variables in both components match. Using this convention, the one-to-one mapping $\mathcal{X}^- \to \mathcal{K}^{1/2}$ can be written as $(x, y) \mapsto \frac12(x + \bar y)$. Accordingly, instead of function $f : \mathcal{K}^{1/2} \to \mathbb{R}$ we will work with the function $g : \mathcal{X}^- \to \mathbb{R}$ defined by

$$g(x, y) = f\left(\frac{x + \bar y}{2}\right) \tag{5}$$

Note that the set of integer labelings $\mathbb{B} \subset \mathcal{K}^{1/2}$ corresponds to the set $\mathcal{X}^\circ = \{u \in \mathcal{X}^- \mid (u_i, u_{i'}) \ne (0, 0)\ \forall i \in V\}$, so function $g : \mathcal{X}^- \to \mathbb{R}$ can be viewed as a discrete relaxation of function $g : \mathcal{X}^\circ \to \mathbb{R}$.

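Definition 9. Function $f : \mathcal{X}^- \to \mathbb{R}$ is called bisubmodular if

$$f(u \sqcap v) + f(u \sqcup v) \le f(u) + f(v) \qquad \forall u, v \in \mathcal{X}^- \tag{6}$$

where $u \sqcap v = u \wedge v$, $u \sqcup v = \mathrm{REDUCE}(u \vee v)$ and $\mathrm{REDUCE}(w)$ is the labeling obtained from $w$ by changing labels $(w_i, w_{i'})$ from $(1, 1)$ to $(0, 0)$ for all $i \in V$.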



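To describe a new characterization, we need to introduce some additional notation. We denote $\mathcal{X} = \{0, 1\}^{\bar V}$ to be the set of all binary labelings of $\bar V$. For a labeling $u \in \mathcal{X}$, define labeling $u'$ by $(u')_i = \overline{u_{i'}}$. Labels $(u_i, u_{i'})$ are transformed according to the rules

$$(0, 1) \to (0, 1) \qquad (1, 0) \to (1, 0) \qquad (0, 0) \to (1, 1) \qquad (1, 1) \to (0, 0) \tag{7}$$

Equivalently, this mapping can be written as $(x, y)' = (\bar y, \bar x)$. Note that $u'' = u$, $(u \wedge v)' = u' \vee v'$ and $(u \vee v)' = u' \wedge v'$ for $u, v \in \mathcal{X}$. Next, we define sets

$$\mathcal{X}^- = \{u \in \mathcal{X} \mid u \le u'\} = \{u \in \mathcal{X} \mid (u_i, u_{i'}) \ne (1, 1)\ \forall i \in V\}$$
$$\mathcal{X}^+ = \{u \in \mathcal{X} \mid u \ge u'\} = \{u \in \mathcal{X} \mid (u_i, u_{i'}) \ne (0, 0)\ \forall i \in V\}$$
$$\mathcal{X}^\circ = \{u \in \mathcal{X} \mid u = u'\} = \{u \in \mathcal{X} \mid (u_i, u_{i'}) \in \{(0, 1), (1, 0)\}\ \forall i \in V\} = \mathcal{X}^- \cap \mathcal{X}^+$$
$$\mathcal{X}^* = \mathcal{X}^- \cup \mathcal{X}^+$$

Clearly, $u \in \mathcal{X}^-$ if and only if $u' \in \mathcal{X}^+$. Also, any function $g : \mathcal{X}^- \to \mathbb{R}$ can be uniquely extended to a function $g : \mathcal{X}^* \to \mathbb{R}$ so that the following condition holds:

$$g(u') = g(u) \qquad \forall u \in \mathcal{X}^* \tag{8}$$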

Proposition 10. Let $g : \mathcal{X}^* \to \mathbb{R}$ be a function satisfying (8). The following conditions are equivalent:

(a) $g$ is bisubmodular, i.e. it satisfies (6).

(b) $g$ satisfies the following inequalities:

$$g(u \wedge v) + g(u \vee v) \le g(u) + g(v) \qquad \text{if } u, v, u \wedge v, u \vee v \in \mathcal{X}^* \tag{9}$$

(c) $g$ satisfies those inequalities in (6) for which $u = w \vee e^i$, $v = w \vee e^j$ where $w = u \wedge v$ and $i, j$ are distinct nodes in $\bar V$ with $w_i = w_j = 0$. Here $e^k$ for node $k \in \bar V$ denotes the labeling in $\mathcal{X}$ with $e^k_k = 1$ and $e^k_{k'} = 0$ for $k' \in \bar V \setminus \{k\}$.

(d) $g$ satisfies those inequalities in (9) for which $u = w \vee e^i$, $v = w \vee e^j$ where $w = u \wedge v$ and $i, j$ are distinct nodes in $\bar V$ with $w_i = w_j = 0$.

A proof is given in Appendix A. Note, an equivalent of characterization (c) was given by Ando et al. [2]; we state it here for completeness.

Remark 1 We reformulated the bisubmodularity condition using standard operations $\wedge, \vee : \mathcal{X} \times \mathcal{X} \to \mathcal{X}$. This will be important in the next section for making a connection to the roof duality relaxation, which also uses operations $\wedge, \vee$. It is worth noting that set $\mathcal{X}^* \subset \mathcal{X}$ is not closed under $\wedge, \vee$. (If $\mathcal{X}^*$ were closed under $\wedge, \vee$ then (9) would be a definition of a submodular function on a distributive lattice.)

Remark 2 In order to compare characterizations (b,d) to existing characterizations (a,c), we need to analyze the sets of inequalities in (b,d) modulo eq. (8), i.e. after replacing terms $g(w)$, $w \in \mathcal{X}^+$ with $g(w')$. It can be seen that the inequalities in (a) are neither a subset nor a superset of those in (b)$^3$, so (b) is a new characterization. It is also possible to show that from this point of view (c) and (d) are equivalent.



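3 Denote $u = \begin{pmatrix} 1\,0\,1\,0 \\ 0\,0\,0\,0 \end{pmatrix}$ and $v = \begin{pmatrix} 0\,1\,0\,0 \\ 0\,0\,1\,0 \end{pmatrix}$ where the top and bottom rows correspond to the labelings of $V$ and $\bar V \setminus V$ respectively, with $|V| = 4$. Plugging pair $(u, v)$ into (6) gives the following inequality:

$$g\begin{pmatrix} 0\,0\,0\,0 \\ 0\,0\,0\,0 \end{pmatrix} + g\begin{pmatrix} 1\,1\,0\,0 \\ 0\,0\,0\,0 \end{pmatrix} \le g\begin{pmatrix} 1\,0\,1\,0 \\ 0\,0\,0\,0 \end{pmatrix} + g\begin{pmatrix} 0\,1\,0\,0 \\ 0\,0\,1\,0 \end{pmatrix}$$

This inequality is a part of (a), but it is not present in (b): pairs $(u, v)$ and $(u', v')$ do not satisfy the condition on the right-hand side of (9), while pairs $(u, v')$ and $(u', v)$ give a different inequality:

$$g\begin{pmatrix} 1\,0\,0\,0 \\ 0\,0\,0\,0 \end{pmatrix} + g\begin{pmatrix} 0\,1\,0\,0 \\ 0\,0\,0\,0 \end{pmatrix} \le g\begin{pmatrix} 1\,0\,1\,0 \\ 0\,0\,0\,0 \end{pmatrix} + g\begin{pmatrix} 0\,1\,0\,0 \\ 0\,0\,1\,0 \end{pmatrix}$$

where we used condition (8). Conversely, the second inequality is a part of (b) but it is not present in (a).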
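The labelings in this footnote can be recomputed mechanically from the definitions of $u'$, $\sqcap$, $\sqcup$ and the sets $\mathcal{X}^-$, $\mathcal{X}^+$, $\mathcal{X}^*$. The sketch below is an illustration only; it assumes that a labeling is stored as a pair of tuples (top row for $V$, bottom row for the mates), and all helper names are hypothetical.

```python
def prime(u):
    """u = (x, y): rows for V and for the mates; returns u' = (not y, not x)."""
    x, y = u
    return (tuple(1 - b for b in y), tuple(1 - a for a in x))

def land(u, v):
    """Component-wise AND of two labelings."""
    return tuple(tuple(a & b for a, b in zip(r, s)) for r, s in zip(u, v))

def lor(u, v):
    """Component-wise OR of two labelings."""
    return tuple(tuple(a | b for a, b in zip(r, s)) for r, s in zip(u, v))

def reduce_(w):
    """REDUCE: replace every pair (1, 1) by (0, 0)."""
    x, y = w
    keep = [not (a == 1 and b == 1) for a, b in zip(x, y)]
    return (tuple(a if k else 0 for a, k in zip(x, keep)),
            tuple(b if k else 0 for b, k in zip(y, keep)))

def in_minus(u):
    return all(pair != (1, 1) for pair in zip(*u))

def in_plus(u):
    return all(pair != (0, 0) for pair in zip(*u))

def in_star(u):
    return in_minus(u) or in_plus(u)

u = ((1, 0, 1, 0), (0, 0, 0, 0))
v = ((0, 1, 0, 0), (0, 0, 1, 0))

# Inequality from (6): g(u meet v) + g(u join v) <= g(u) + g(v)
print(land(u, v), reduce_(lor(u, v)))
# (u, v) is not admissible in (9) because u OR v lies outside X*
print(in_star(lor(u, v)))
# Pair (u, v') is admissible; g(u OR v') is rewritten via (8) as g((u OR v')')
print(land(u, prime(v)), prime(lor(u, prime(v))))
```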
4 Submodular relaxations and roof duality

Consider a submodular function $g : \mathcal{X} \to \mathbb{R}$ satisfying the following "symmetry" condition:

$$g(u') = g(u) \qquad \forall u \in \mathcal{X} \tag{10}$$

We call such function $g$ a submodular relaxation of function $f(x) = g(x, \bar x)$. Clearly, it satisfies conditions of proposition 10, so $g$ is also a bisubmodular relaxation of $f$. Furthermore, minimizing $g$ is equivalent to minimizing its restriction $g : \mathcal{X}^- \to \mathbb{R}$; indeed, if $u \in \mathcal{X}$ is a minimizer of $g$ then so are $u'$ and $u \wedge u' \in \mathcal{X}^-$.

In this section we will do the following: (i) prove that any pseudo-boolean function $f : \mathbb{B} \to \mathbb{R}$ has a submodular relaxation $g : \mathcal{X} \to \mathbb{R}$; (ii) show that the roof duality relaxation for quadratic pseudo-boolean functions is a submodular relaxation, and it dominates all other bisubmodular relaxations; (iii) show that for non-quadratic pseudo-boolean functions bisubmodular relaxations can be tighter than submodular ones; (iv) prove that, similar to the roof duality relaxation, bisubmodular relaxations possess the persistency property.

Review of roof duality Consider a quadratic pseudo-boolean function $f : \mathbb{B} \to \mathbb{R}$:

$$f(x) = \sum_{i \in V} f_i(x_i) + \sum_{(i,j) \in E} f_{ij}(x_i, x_j) \tag{11}$$

where $(V, E)$ is an undirected graph and $x_i \in \{0, 1\}$ for $i \in V$ are binary variables. Hammer, Hansen and Simeone [13] formulated several linear programming relaxations of this function and showed their equivalence. One of these formulations was called a roof dual. An efficient maxflow-based method for solving the roof duality relaxation was given by Hammer, Boros and Sun [5, 4].

We will rely on this algorithmic description of the roof duality approach [4]. The method's idea can be summarized as follows. Each variable $x_i$ is replaced with two binary variables $u_i$ and $u_{i'}$ corresponding to $x_i$ and $1 - x_i$ respectively. The new set of nodes is $\bar V = \{i, i' \mid i \in V\}$. Next, function $f$ is transformed to a function $g : \mathcal{X} \to \mathbb{R}$ by replacing each term according to the following rules:

$$f_i(x_i) \mapsto \tfrac12\left[f_i(u_i) + f_i(\bar u_{i'})\right] \tag{12a}$$
$$f_{ij}(x_i, x_j) \mapsto \tfrac12\left[f_{ij}(u_i, u_j) + f_{ij}(\bar u_{i'}, \bar u_{j'})\right] \quad \text{if } f_{ij}(\cdot, \cdot) \text{ is submodular} \tag{12b}$$
$$f_{ij}(x_i, x_j) \mapsto \tfrac12\left[f_{ij}(u_i, \bar u_{j'}) + f_{ij}(\bar u_{i'}, u_j)\right] \quad \text{if } f_{ij}(\cdot, \cdot) \text{ is not submodular} \tag{12c}$$

$g$ is a submodular quadratic pseudo-boolean function, so it can be minimized via a maxflow algorithm. If $u \in \mathcal{X}$ is a minimizer of $g$ then the roof duality relaxation has a minimizer $\hat x$ with $\hat x_i = \frac12(u_i + \bar u_{i'})$ [4].

It is easy to check that $g(u) = g(u')$ for all $u \in \mathcal{X}$, therefore $g$ is a submodular relaxation. Also, $f$ and $g$ are equivalent when $u_{i'} = \bar u_i$ for all $i \in V$, i.e.

$$g(x, \bar x) = f(x) \qquad \forall x \in \mathbb{B} \tag{13}$$
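The construction (12) and the properties claimed for it can be verified by enumeration on a toy instance. The sketch below is an illustration under assumptions: terms are stored as small lookup tables, a labeling of $\bar V$ is passed as a pair `(x, y)` with `y` holding the labels of the mates, and the numeric tables are hypothetical values chosen only for the example. It does not implement the maxflow-based minimization.

```python
from itertools import product

def is_submodular_term(t):
    """t[a][b] = f_ij(a, b) is submodular iff t00 + t11 <= t01 + t10."""
    return t[0][0] + t[1][1] <= t[0][1] + t[1][0]

def make_f(unary, pairwise):
    """Quadratic pseudo-boolean function (11) from unary and pairwise tables."""
    def f(x):
        return (sum(t[x[i]] for i, t in unary.items())
                + sum(t[x[i]][x[j]] for (i, j), t in pairwise.items()))
    return f

def roof_dual(unary, pairwise):
    """Function g(x, y) built by rules (12); y holds the labels of the mates."""
    def g(x, y):
        total = 0.0
        for i, t in unary.items():                     # rule (12a)
            total += 0.5 * (t[x[i]] + t[1 - y[i]])
        for (i, j), t in pairwise.items():
            if is_submodular_term(t):                  # rule (12b)
                total += 0.5 * (t[x[i]][x[j]] + t[1 - y[i]][1 - y[j]])
            else:                                      # rule (12c)
                total += 0.5 * (t[x[i]][1 - y[j]] + t[1 - y[i]][x[j]])
        return total
    return g

def neg(z):
    return tuple(1 - a for a in z)

# Hypothetical toy instance on 3 nodes (values chosen only for illustration).
unary = {0: (0, 2), 1: (1, 0), 2: (0, 0)}
pairwise = {(0, 1): ((0, 3), (1, 0)),   # submodular term
            (1, 2): ((2, 0), (0, 1))}   # non-submodular term
n = 3
f = make_f(unary, pairwise)
g = roof_dual(unary, pairwise)
B = list(product((0, 1), repeat=n))

symmetric = all(g(neg(y), neg(x)) == g(x, y) for x in B for y in B)   # (10)
restricts = all(g(x, neg(x)) == f(x) for x in B)                       # (13)

def g_flat(u):
    return g(u[:n], u[n:])

U = list(product((0, 1), repeat=2 * n))
submodular = all(
    g_flat(tuple(a & b for a, b in zip(u, v)))
    + g_flat(tuple(a | b for a, b in zip(u, v)))
    <= g_flat(u) + g_flat(v)
    for u in U for v in U)
print(symmetric, restricts, submodular)
```

All table entries are integers, so the halves in (12) are exact in floating point and the comparisons above need no tolerance.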

Invariance to variable flipping Suppose that $g$ is a (bi-)submodular relaxation of function $f : \mathbb{B} \to \mathbb{R}$. Let $i$ be a fixed node in $V$, and consider function $f'(x)$ obtained from $f(x)$ by a change of coordinates $x_i \mapsto \bar x_i$ and function $g'(u)$ obtained from $g(u)$ by swapping variables $u_i$ and $u_{i'}$. It is easy to check that $g'$ is a (bi-)submodular relaxation of $f'$. Furthermore, if $f$ is a quadratic pseudo-boolean function and $g$ is its submodular relaxation constructed by the roof duality approach, then applying the roof duality approach to $f'$ yields function $g'$. We will sometimes use this "flipping" operation to reduce the number of considered cases.

Conversion to roof duality Let us now consider a non-quadratic pseudo-boolean function $f : \mathbb{B} \to \mathbb{R}$. Several papers [32, 1, 16] proposed the following scheme: (1) Convert $f$ to a quadratic pseudo-boolean function $\tilde f$ by introducing $k$ auxiliary binary variables so that $f(x) = \min_{\alpha \in \{0,1\}^k} \tilde f(x, \alpha)$ for all labelings $x \in \mathbb{B}$. (2) Construct submodular relaxation $\tilde g(x, \alpha, y, \beta)$ of $\tilde f$ by applying the roof duality relaxation to $\tilde f$; then

$$\tilde g(x, \alpha, y, \beta) = \tilde g(\bar y, \bar\beta, \bar x, \bar\alpha), \qquad \tilde g(x, \alpha, \bar x, \bar\alpha) = \tilde f(x, \alpha) \qquad \forall x, y \in \mathbb{B},\ \alpha, \beta \in \{0, 1\}^k$$

(3) Obtain function $g$ by minimizing out auxiliary variables: $g(x, y) = \min_{\alpha, \beta \in \{0,1\}^k} \tilde g(x, \alpha, y, \beta)$.

One can check that $g(x, y) = g(\bar y, \bar x)$, so $g$ is a submodular relaxation$^4$. In general, however, it may not be a relaxation of function $f$, i.e. (13) may not hold; we are only guaranteed to have $g(x, \bar x) \le f(x)$ for all labelings $x \in \mathbb{B}$.

Existence of submodular relaxations It is easy to check that if $f : \mathbb{B} \to \mathbb{R}$ is submodular then function $g(x, y) = \frac12[f(x) + f(\bar y)]$ is a submodular relaxation of $f$.$^5$ Thus, monomials of the form $c \prod_{i \in A} x_i$ where $c \le 0$ and $A \subseteq V$ have submodular relaxations. Using the "flipping" operation $x_i \mapsto \bar x_i$, we conclude that submodular relaxations also exist for monomials of the form $c \prod_{i \in A} x_i \prod_{i \in B} \bar x_i$ where $c \le 0$ and $A, B$ are disjoint subsets of $V$. It is known that any pseudo-boolean function $f$ can be represented as a sum of such monomials (see e.g. [4]; we need to represent $-f$ as a posiform and take its negative). This implies that any pseudo-boolean function $f$ has a submodular relaxation.
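The first step of this argument is easy to confirm numerically for a single negative monomial. The sketch below is a hedged illustration: the cubic monomial with coefficient $-3$ is a hypothetical example, and the submodularity test is a plain enumeration rather than anything proposed in the paper.

```python
from itertools import product

def f(x):
    # Hypothetical negative monomial c * x0 * x1 * x2 with c = -3 (submodular).
    return -3 * x[0] * x[1] * x[2]

def neg(z):
    return tuple(1 - a for a in z)

def g(x, y):
    """Candidate submodular relaxation g(x, y) = (f(x) + f(not y)) / 2."""
    return 0.5 * (f(x) + f(neg(y)))

def is_submodular(func, m):
    """Enumerate func(u AND v) + func(u OR v) <= func(u) + func(v) on {0,1}^m."""
    pts = list(product((0, 1), repeat=m))
    return all(
        func(tuple(a & b for a, b in zip(u, v)))
        + func(tuple(a | b for a, b in zip(u, v)))
        <= func(u) + func(v)
        for u in pts for v in pts)

n = 3
B = list(product((0, 1), repeat=n))

def g_flat(u):
    return g(u[:n], u[n:])

f_is_submodular = is_submodular(f, n)
g_is_submodular = is_submodular(g_flat, 2 * n)
g_is_symmetric = all(g(neg(y), neg(x)) == g(x, y) for x in B for y in B)
g_restricts_to_f = all(g(x, neg(x)) == f(x) for x in B)
print(f_is_submodular, g_is_submodular, g_is_symmetric, g_restricts_to_f)
```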

Note that this argument is due to Lu and Williams [24] who converted function $f$ to a sum of monomials of the form $c \prod_{i \in A} x_i$ and $c\, \bar x_k \prod_{i \in A} x_i$, $c \le 0$, $k \notin A$. It is possible to show that the relaxation proposed in [24] is equivalent to the submodular relaxation constructed by the scheme above (we omit the derivation).

Submodular vs. bisubmodular relaxations An important question is whether bisubmodular relaxations are more "powerful" compared to submodular ones. The next theorem gives a class of functions for which the answer is negative; its proof is given in Appendix B.

Theorem 11. Let $g$ be the submodular relaxation of a quadratic pseudo-boolean function $f$ defined by (12), and assume that the set $E$ does not have parallel edges. Then $g$ dominates any other bisubmodular relaxation $\bar g$ of $f$, i.e. $g(u) \ge \bar g(u)$ for all $u \in \mathcal{X}^-$.

For non-quadratic pseudo-boolean functions, however, the situation can be different. In Appendix C we give an example of a function $f$ of $n = 4$ variables which has a tight bisubmodular relaxation $g$ (i.e. $g$ has a minimizer in $\mathcal{X}^\circ$), but none of its submodular relaxations is tight.

Persistency Finally, we show that bisubmodular functions possess the autarky property, which implies persistency.

Proposition 12. Let $f : \mathcal{K}^{1/2} \to \mathbb{R}$ be a bisubmodular function and $x \in \mathcal{K}^{1/2}$ be its minimizer.

[Autarky] Let $y$ be a labeling in $\mathbb{B}$. Consider labeling $z = (y \sqcup x) \sqcup x$. Then $z \in \mathbb{B}$ and $f(z) \le f(y)$.

[Persistency] Function $f|_{\mathbb{B}} : \mathbb{B} \to \mathbb{R}$ has a minimizer $x^* \in \mathbb{B}$ such that $x^*_i = x_i$ for nodes $i \in V$ with integral $x_i$.

Proof. It can be checked that $z_i = y_i$ if $x_i = \frac12$ and $z_i = x_i$ if $x_i \in \{0, 1\}$. Thus, $z \in \mathbb{B}$. For any $w \in \mathcal{K}^{1/2}$ there holds $f(w \sqcup x) \le f(w) + [f(x) - f(w \sqcap x)] \le f(w)$, where the first inequality is bisubmodularity and the second holds because $x$ is a minimizer. This implies that $f((y \sqcup x) \sqcup x) \le f(y)$. Applying the autarky property to a labeling $y \in \arg\min\{f(x) \mid x \in \mathbb{B}\}$ yields persistency. □
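The autarky step can be traced on a small example. The sketch below is an illustration under assumptions: it uses the standard component-wise join for labels in $\{0, \frac12, 1\}$, and the function `f` is a hypothetical sum of bisubmodular terms (two absolute-value penalties and a linear term) that is not taken from the paper.

```python
from itertools import product

HALF = 0.5

def join(a, b):
    """Component-wise join: sign of the sum in {-1, 0, 1} coordinates."""
    s = (2 * a - 1) + (2 * b - 1)
    return ((s > 0) - (s < 0) + 1) / 2

def vec_join(x, y):
    return tuple(join(a, b) for a, b in zip(x, y))

def f(x):
    # Hypothetical bisubmodular function: the first two terms pull (x0, x1)
    # towards (1/2, 1/2), the linear term prefers x2 = 0.
    return 2 * abs(x[0] - x[1]) + 2 * abs(x[0] + x[1] - 1) + 3 * x[2]

n = 3
K_half = list(product((0, HALF, 1), repeat=n))
B = list(product((0, 1), repeat=n))

x_min = min(K_half, key=f)          # half-integral minimizer of f

def autarky(y):
    """z = (y join x_min) join x_min, as in Proposition 12."""
    return vec_join(vec_join(y, x_min), x_min)

results = [(y, autarky(y)) for y in B]
for y, z in results:
    print(y, "->", z, f(y), f(z))
```

In this example the minimizer is integral only at the last node, so every binary labeling is mapped to one that agrees with the minimizer there, without increasing the value of `f`.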



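4 It is well-known that minimizing variables out preserves submodularity. Indeed, suppose that $h(x) = \min_\alpha \bar h(x, \alpha)$ where $\bar h$ is a submodular function. Then $h$ is also submodular since, for minimizers $\alpha$ and $\beta$ attaining $h(x)$ and $h(y)$ respectively,

$$h(x) + h(y) = \bar h(x, \alpha) + \bar h(y, \beta) \ge \bar h(x \wedge y, \alpha \wedge \beta) + \bar h(x \vee y, \alpha \vee \beta) \ge h(x \wedge y) + h(x \vee y)$$

5 In fact, it dominates all other bisubmodular relaxations $\bar g : \mathcal{X}^- \to \mathbb{R}$ of $f$. Indeed, consider labeling $(x, y) \in \mathcal{X}^-$. It can be checked that $(x, y) = u \sqcap v = u \sqcup v$ where $u = (x, \bar x)$ and $v = (\bar y, y)$, therefore $\bar g(x, y) \le \frac12[\bar g(u) + \bar g(v)] = \frac12[f(x) + f(\bar y)] = g(x, y)$.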



5 Conclusions and future work 

We showed that bisubmodular functions provide a natural generalization of the roof duality approach to 
higher-order terms. This can be viewed as a non-submodular analogue of the fact that submodular functions 
generalize the s-t minimum cut problem with non-negative weights to higher-order terms. 

As mentioned in the introduction, this work has been motivated by computer vision applications that use 
functions of the form $f(x) = \sum_C f_C(x)$. An important open question is how to construct bisubmodular relaxations $\hat f_C$ for individual terms. For terms of low order, e.g. with $|C| = 3$, this potentially could be done by solving a small linear program.

Another important question is how to minimize such functions. Algorithms in [12, 25] are unlikely to
be practical for most vision problems, which typically have tens of thousands of variables. However, in our 
case we need to minimize a bisubmodular function which has a special structure: it is represented as a sum 
of low-order bisubmodular terms. We recently showed [20] that a sum of low-order submodular terms can 
be optimized more efficiently using maxflow-like techniques. We conjecture that similar techniques can be 
developed for bisubmodular functions as well. 

Appendix A: Proof of proposition 10 (definitions of bisubmodularity)

Directions (a)$\Rightarrow$(c) and (b)$\Rightarrow$(d) are trivial. Below we prove directions (b)$\Rightarrow$(a), (d)$\Rightarrow$(b) and (c)$\Rightarrow$(d). We use the following notation: for a labeling $u \in \mathcal{X}$ and distinct nodes $i, j \in \bar V$ we denote $[u]_i = (u_i, u_{i'})$, $[u]_{ij} = (u_i, u_{i'}, u_j, u_{j'})$.

Direction (b)$\Rightarrow$(a) For labelings $u, v \in \mathcal{X}^-$ define $\alpha = u \wedge v'$, $\beta = u' \wedge v$. Clearly, $\alpha, \beta \in \mathcal{X}^-$. Also, $\alpha \wedge \beta = u \sqcap v$ and $\alpha \vee \beta = u \sqcup v$. We can write

$$g(u \sqcap v) + g(u \sqcup v) = g(\alpha \wedge \beta) + g(\alpha \vee \beta) \le g(\alpha) + g(\beta) = g(\alpha) + g(\beta')$$
$$= g(u \wedge v') + g(u \vee v') \le g(u) + g(v') = g(u) + g(v)$$

where we used conditions (8) and (9). (It can be checked that all labelings involved belong to $\mathcal{X}^*$.)

Direction (d)$\Rightarrow$(b) We show that all inequalities in (9) hold using induction on the Hamming distance $\|u - v\|_1 = \sum_{i \in \bar V} |u_i - v_i|$ between $u$ and $v$. The base case $\|u - v\|_1 \le 2$ is straightforward: if labelings $u, v, u \vee v, u \wedge v \in \mathcal{X}^*$ are all distinct then (9) follows directly from condition (d), otherwise (9) is an equality.

Suppose $\|u - v\|_1 \ge 3$ and $u, v, u \vee v, u \wedge v \in \mathcal{X}^*$. We can assume by symmetry that $u_i = 1$, $v_i = 0$ for at least two nodes $i \in \bar V$. Among such nodes, let us choose $i$ as follows: (a) if there is such $i$ with $u_{i'} = 1$ pick this node; (b) otherwise if there is such $i$ with $v_{i'} = 1$ pick this node; (c) otherwise pick an arbitrary such node. Let $\tilde u$ be the labeling obtained from $u$ by switching the label of $i$ from 1 to 0. It can be checked that $u \vee (\tilde u \vee v) = u \vee v$ and $u \wedge (\tilde u \vee v) = \tilde u$. There holds $\|u - (\tilde u \vee v)\|_1 = |\{j \in \bar V \mid u_j < v_j\}| + 1 < |\{j \in \bar V \mid u_j < v_j\}| + |\{j \in \bar V \mid u_j > v_j\}| = \|u - v\|_1$ and $\|\tilde u - v\|_1 = \|u - v\|_1 - 1$, so by the induction hypothesis

$$g(u \vee v) - g(u) \le g(\tilde u \vee v) - g(\tilde u) \le g(v) - g(\tilde u \wedge v) = g(v) - g(u \wedge v)$$

provided that all labelings involved belong to $\mathcal{X}^*$. This fact is proven below.

Let us show that $\tilde u \in \mathcal{X}^*$. If $u_{i'} = 1$ then $[u]_i = (1, 1)$. $\tilde u$ is obtained from $u$ by switching $[u]_i$ from $(1, 1)$ to $(0, 1)$, so $u \in \mathcal{X}^*$ implies $\tilde u \in \mathcal{X}^*$. Suppose that $u_{i'} = 0$; this means that $i$ was not selected in rule (a). If $u \notin \mathcal{X}^-$ then there exists $j \in V$ with $[u]_j = (1, 1)$. Since case (a) was not "triggered", we must have $[v]_j = (1, 1)$. But then $[u \wedge v]_{ij} = (0, 0, 1, 1)$ so $u \wedge v \notin \mathcal{X}^*$ - a contradiction. Thus, $u \in \mathcal{X}^-$ and so $\tilde u \in \mathcal{X}^-$.

Let us now show that $\tilde w = \tilde u \vee v \in \mathcal{X}^*$. Clearly, $\tilde w$ is obtained from $w = u \vee v$ by switching label $w_i$ from 1 to 0. If $w_{i'} = 1$ then $[w]_i = (1, 1)$ switches to $[\tilde w]_i = (0, 1)$, so $w \in \mathcal{X}^*$ implies $\tilde w \in \mathcal{X}^*$. Suppose that $w_{i'} = 0$, then $u_{i'} = v_{i'} = 0$; this means that rules (a) and (b) were not "triggered". Let us prove that $w \in \mathcal{X}^-$; this will imply $\tilde w \in \mathcal{X}^-$. Suppose not, then there exists $j \in V$ with $[w]_j = (1, 1)$. The case $[v]_j = (1, 1)$ is impossible since then we would have $[v]_{ij} = (0, 0, 1, 1)$ and $v \notin \mathcal{X}^*$ - a contradiction. Thus, we can assume by symmetry that $v_{j'} = 0$, so $u_{j'} = 1$. We must have $u_j = 1$ or $v_j = 1$, which means that either rule (a) or (b) would be triggered - a contradiction.

Direction (c)$\Rightarrow$(d) Let $u, v$ be labelings with the properties of condition (d). Thus, $u = w \vee e^i$, $v = w \vee e^j$ and $w_i = w_j = 0$.

Suppose that $j = i'$, so $[w]_i = (0, 0)$. We must have $[w]_k \in \{(0, 1), (1, 0)\}$ for all $k \in \bar V \setminus \{i, j\}$, otherwise we would have either $u \wedge v \notin \mathcal{X}^*$ or $u \vee v \notin \mathcal{X}^*$. Therefore, $u, v \in \mathcal{X}^\circ$ and $g(u \sqcup v) = g(w) = g(w') = g(u \vee v)$, so (9) follows from (6).

We now assume that $j \ne i$ and $j \ne i'$. We have $[w]_{ij} = (0, ?, 0, ?)$. Cases $[w]_{ij} = (0, 0, 0, 1)$ and $[w]_{ij} = (0, 1, 0, 0)$ are impossible since then either $u$ or $v$ would not belong to $\mathcal{X}^*$. If $[w]_{ij} = (0, 0, 0, 0)$ then $u, v \in \mathcal{X}^-$ and $u \sqcup v = u \vee v$, so (9) follows from (6). It remains to consider the case $[w]_{ij} = [w']_{ij} = (0, 1, 0, 1)$. Labeling $u'$ is obtained from $w'$ by switching the label of node $i'$ from 1 to 0. Similarly, $v'$ is obtained from $w'$ by switching the label of node $j'$ from 1 to 0. We have $[u']_{ij} = (0, 0, 0, 1)$, $[v']_{ij} = (0, 1, 0, 0)$, so $u', v' \in \mathcal{X}^-$. Furthermore, $u' \sqcup v' = u' \vee v'$. Therefore,

$$g(u \vee v) + g(u \wedge v) = g(u' \wedge v') + g(u' \vee v') = g(u' \sqcap v') + g(u' \sqcup v') \le g(u') + g(v') = g(u) + g(v)$$

Appendix B: Proof of theorem CU 

In order to simplify the proof (i.e. reduce the number of considered cases) we will use the "flipping" 
operation described in section 01 

For $u \in X$ we define sets $V_{00}[u] = \{i \in V \mid (u_i, u_{i'}) = (0,0)\}$ and $\bar V_{00}[u] = \{i \in \bar V \mid (u_i, u_{i'}) = (0,0)\}$.
In this appendix we denote $u^i$ to be the labeling obtained from labeling $u \in X$ by setting the label of node
$i \in \bar V$ to 1, i.e. $u^i = u \vee e^i$. Similarly, we denote $u^{ij} = u \vee e^i \vee e^j$.

Lemma 13. Suppose that $u \in X$ and $i, j$ are distinct nodes in $\bar V_{00}[u]$ satisfying the following conditions:
(i) if $i, j \in V$ then the term $f_{ij}(\cdot,\cdot)$ (if it exists) is non-submodular; (ii) if $i, j \in \bar V \setminus V$ then the term $f_{i'j'}(\cdot,\cdot)$
(if it exists) is non-submodular; (iii) if $i \in V$, $j \in \bar V \setminus V$ then the term $f_{ij'}(\cdot,\cdot)$ (if it exists) is submodular;
(iv) if $i \in \bar V \setminus V$, $j \in V$ then the term $f_{i'j}(\cdot,\cdot)$ (if it exists) is submodular. Then

\[ \bar g(u) + \bar g(u^{ij}) = \bar g(u^i) + \bar g(u^j) \tag{14} \]



Proof. It suffices to prove the lemma in the case when function $f$ (a sum of unary and pairwise terms) has a single term; the general case will then follow by linearity. If this term does not involve nodes $i/i'$ and $j/j'$ then the claim is trivial since then $\bar g(u)$ does not depend on $u_i$ or $u_j$. Thus, we consider terms involving at least one of the nodes $i, i', j, j'$.

Suppose that $j = i'$; without loss of generality we can assume that $i \in V$. If $f(x) = f_i(x_i)$ then the LHS and the RHS of (14) equal $f_i(0) + f_i(1)$. If $f(x) = f_{ik}(x_i, x_k)$ and term $f_{ik}(\cdot,\cdot)$ is submodular then the LHS and the RHS of (14) equal $\frac{1}{2}[f_{ik}(0, u_k) + f_{ik}(1, \bar u_{k'}) + f_{ik}(1, u_k) + f_{ik}(0, \bar u_{k'})]$. The case when $f(x) = f_{ik}(x_i, x_k)$ and term $f_{ik}(\cdot,\cdot)$ is non-submodular can be reduced to the previous one by flipping node $k$.

Now suppose that $j \ne i'$. By assumption, $(u_i, u_{i'}, u_j, u_{j'}) = (0,0,0,0)$. Using flipping, we can ensure that $i, j \in V$. (Note that flipping $i$ and/or $j$ preserves conditions (i)-(iv).) Suppose $f(x)$ involves exactly one of the nodes $i, j$, say node $i$. If $f(x) = f_i(x_i)$ then the LHS and the RHS of (14) equal $\frac{1}{2}[f_i(0) + 3f_i(1)]$. If $f(x) = f_{ik}(x_i, x_k)$ and term $f_{ik}(\cdot,\cdot)$ is submodular then the LHS and the RHS of (14) equal $\frac{1}{2}[f_{ik}(0, u_k) + f_{ik}(1, \bar u_{k'}) + f_{ik}(1, u_k) + f_{ik}(1, \bar u_{k'})]$. The case when $f(x) = f_{ik}(x_i, x_k)$ and term $f_{ik}(\cdot,\cdot)$ is non-submodular can be reduced to the previous one by flipping node $k$. It remains to consider the case when $f(x) = f_{ij}(x_i, x_j)$. By the lemma's assumption, term $f_{ij}(\cdot,\cdot)$ is non-submodular, so the LHS and the RHS of (14) equal $\frac{1}{2}[f_{ij}(0,1) + f_{ij}(1,0) + 2f_{ij}(1,1)]$. □

Figure 1: Examples of bisubmodular functions. (a-c) Cardinality-dependent functions $g : X^- \to \mathbb{R}$ written as $g(u) = G(n_{01}[u], n_{10}[u])$ where $n_{\alpha\beta}[u] = |\{i \in V \mid (u_i, u_{i'}) = (\alpha, \beta)\}|$. Here $n = 3$. (a) Convention for displaying function $G$. (b,c) Bisubmodular relaxations of the same function $f$. Function (c) can be extended to a submodular relaxation, while (b) cannot be extended. (d) Function $f$ of $n = 4$ variables which has a tight bisubmodular relaxation, but all submodular relaxations are not tight.
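
Identity (14) of Lemma 13 can be checked numerically for a single pairwise term $f_{ij}$ with $i, j \in V$ and $(u_i, u_{i'}, u_j, u_{j'}) = (0,0,0,0)$. The sketch below assumes the usual roof-duality construction, in which a submodular term contributes $\frac{1}{2}[f(u_i, u_j) + f(1-u_{i'}, 1-u_{j'})]$ and a non-submodular term contributes $\frac{1}{2}[f(u_i, 1-u_{j'}) + f(1-u_{i'}, u_j)]$. This construction is not restated in this appendix, so it is an assumption here, chosen because it reproduces the values quoted in the proof. All function names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def is_submodular(f):
    """f is a dict {(x, y): value}; submodular means f(0,0) + f(1,1) <= f(0,1) + f(1,0)."""
    return f[0, 0] + f[1, 1] <= f[0, 1] + f[1, 0]

def relaxed_pair_term(f, ui, ui2, uj, uj2):
    """Contribution of the term f_ij at [u]_ij = (u_i, u_i', u_j, u_j') (assumed construction)."""
    if is_submodular(f):
        return HALF * (f[ui, uj] + f[1 - ui2, 1 - uj2])
    return HALF * (f[ui, 1 - uj2] + f[1 - ui2, uj])

def identity_14_holds(f):
    """Check g(u) + g(u^{ij}) == g(u^i) + g(u^j) when [u]_ij = (0, 0, 0, 0)."""
    g_u = relaxed_pair_term(f, 0, 0, 0, 0)
    g_uij = relaxed_pair_term(f, 1, 0, 1, 0)  # labels of nodes i and j set to 1
    g_ui = relaxed_pair_term(f, 1, 0, 0, 0)   # label of node i set to 1
    g_uj = relaxed_pair_term(f, 0, 0, 1, 0)   # label of node j set to 1
    return g_u + g_uij == g_ui + g_uj

f_non_submodular = {(0, 0): 0, (0, 1): 2, (1, 0): 3, (1, 1): 7}
f_submodular = {(0, 0): 0, (0, 1): 2, (1, 0): 3, (1, 1): 1}

print(identity_14_holds(f_non_submodular))  # True
print(identity_14_holds(f_submodular))      # False
```

The second example shows why condition (i) asks for a non-submodular term: with a strictly submodular $f_{ij}$ between two nodes of $V$ the two sides of (14) differ.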



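Lemma 14. Labeling $u \in X$ with $V_{00}[u] = \{i\}$ satisfies $\hat g(u) \le \bar g(u)$.

Proof. We have

\[ 2\hat g(u) \overset{(1)}{\le} \hat g(u^i) + \hat g(u^{i'}) \overset{(2)}{=} \bar g(u^i) + \bar g(u^{i'}) \overset{(3)}{=} \bar g(u) + \bar g(u^{ii'}) \overset{(4)}{=} 2\bar g(u) \]

where (1) holds since $\hat g$ is bisubmodular and $u^i \sqcap u^{i'} = u^i \sqcup u^{i'} = u$, (2) holds since $\hat g$ and $\bar g$ are relaxations of $f$ and $u^i, u^{i'} \in X^\circ$, (3) holds by lemma 13 and (4) holds since $\bar g(u^{ii'}) = \bar g(u') = \bar g(u)$. □

Lemma 15. There holds $\hat g(u) - \hat g(u^i) \le \bar g(u) - \bar g(u^i)$ for all $u \in X^-$ and $i \in \bar V_{00}[u]$.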



Proof. We use induction on $|V_{00}[u]|$. If $V_{00}[u] = \emptyset$ then the claim is trivial. If $\bar V_{00}[u] = \{i, i'\}$ then the claim follows from lemma 14 and the fact that $\hat g(u^i) = \bar g(u^i)$, which holds since $u^i \in X^\circ$. Now suppose that there exists $j \in \bar V_{00}[u] \setminus \{i, i'\}$. We can assume without loss of generality that labeling $u$ and nodes $i, j$ satisfy conditions of lemma 13. (If not, we can replace $j$ with $j'$.) We can write

\[ \hat g(u) - \hat g(u^i) \overset{(1)}{\le} \hat g(u^j) - \hat g(u^{ij}) \overset{(2)}{\le} \bar g(u^j) - \bar g(u^{ij}) \overset{(3)}{=} \bar g(u) - \bar g(u^i) \]

where (1) holds since $\hat g$ is bisubmodular, (2) holds by the induction hypothesis and (3) follows from lemma 13. □

We are now ready to prove theorem 11, i.e. that $\hat g(u) \le \bar g(u)$ for any $u \in X^-$. We use induction on $|V_{00}[u]|$. The case $V_{00}[u] = \emptyset$ is trivial and the case $V_{00}[u] = \{i\}$ follows from lemma 14. Suppose that there exists $j \in \bar V_{00}[u] \setminus \{i, i'\}$. As before, we can assume without loss of generality that labeling $u$ and nodes $i, j$ satisfy conditions of lemma 13. We can write

\[ \hat g(u) \overset{(1)}{\le} \hat g(u^i) + [\hat g(u^j) - \hat g(u^{ij})] \overset{(2)}{\le} \bar g(u^i) + [\bar g(u^j) - \bar g(u^{ij})] \overset{(3)}{=} \bar g(u) \]

where (1) holds since $\hat g$ is bisubmodular, (2) holds by the induction hypothesis and lemma 15, and (3) follows from lemma 13.



Appendix C: Examples of bisubmodular functions 

First, let us consider cardinality-dependent functions $g : X^- \to \mathbb{R}$, i.e. functions which can be expressed as

\[ g(u) = G(n_{01}[u], n_{10}[u]) \]

where $n_{\alpha\beta}[u] = |\{i \in V \mid (u_i, u_{i'}) = (\alpha, \beta)\}|$ for a labeling $u \in X$ and function $G$ is defined over $D_n = \{(a,b) \in \mathbb{Z}^2 \mid a, b \ge 0,\ a + b \le n\}$. Using proposition 10(c) it is easy to check that $g$ is bisubmodular if and only if $G$ satisfies the following:

\begin{align}
G(a,b) + G(a-2,b) &\le 2G(a-1,b) & &\forall (a,b) \in D_n,\ a \ge 2 \tag{15a} \\
G(a,b) + G(a,b-2) &\le 2G(a,b-1) & &\forall (a,b) \in D_n,\ b \ge 2 \tag{15b} \\
G(a,b) + G(a-1,b-1) &\le G(a-1,b) + G(a,b-1) & &\forall (a,b) \in D_n,\ a \ge 1,\ b \ge 1 \tag{15c} \\
2G(a,b) &\le G(a+1,b) + G(a,b+1) & &\forall (a,b) \in D_n,\ a + b = n-1 \tag{15d}
\end{align}
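
Conditions (15a)-(15d) are easy to test mechanically. The following is a minimal Python sketch, assuming that $G$ is stored as a dictionary keyed by the pairs $(a, b) \in D_n$; the function name and the two example tables are hypothetical. The inequalities are written out in the comments in the form used by the code.

```python
def satisfies_conditions_15(G, n):
    """Check (15a)-(15d) for G given as a dict {(a, b): value} over D_n."""
    domain = [(a, b) for a in range(n + 1) for b in range(n + 1 - a)]
    for a, b in domain:
        # (15a): G(a,b) + G(a-2,b) <= 2 G(a-1,b) for a >= 2
        if a >= 2 and G[a, b] + G[a - 2, b] > 2 * G[a - 1, b]:
            return False
        # (15b): G(a,b) + G(a,b-2) <= 2 G(a,b-1) for b >= 2
        if b >= 2 and G[a, b] + G[a, b - 2] > 2 * G[a, b - 1]:
            return False
        # (15c): G(a,b) + G(a-1,b-1) <= G(a-1,b) + G(a,b-1) for a, b >= 1
        if a >= 1 and b >= 1 and G[a, b] + G[a - 1, b - 1] > G[a - 1, b] + G[a, b - 1]:
            return False
        # (15d): 2 G(a,b) <= G(a+1,b) + G(a,b+1) for a + b = n - 1
        if a + b == n - 1 and 2 * G[a, b] > G[a + 1, b] + G[a, b + 1]:
            return False
    return True

n = 3
D_n = [(a, b) for a in range(n + 1) for b in range(n + 1 - a)]

# Illustrative tables (not taken from Figure 1).
G_ok = {(a, b): -(n - a - b) ** 2 for (a, b) in D_n}
G_bad = {(a, b): (a + b) ** 2 for (a, b) in D_n}

print(satisfies_conditions_15(G_ok, n))   # True
print(satisfies_conditions_15(G_bad, n))  # False
```

Only $O(n^2)$ inequalities are checked, in contrast to an enumeration over all pairs of labelings in $X^-$.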



Proposition 16. Consider function $G : D_3 \to \mathbb{R}$ defined by Figure 1(b). Let $g$ be the corresponding cardinality-dependent function of $2n = 6$ binary variables. Then (a) $g$ is bisubmodular and (b) it cannot be extended to a submodular relaxation $\bar g : X \to \mathbb{R}$.

Proof. Verifying that function $G$ satisfies (15) is straightforward, so we focus on claim (b). Suppose a submodular relaxation $\bar g : X \to \mathbb{R}$ that extends $g$ does exist. Without loss of generality we can assume that $\bar g(u) = \bar G(n_{01}[u], n_{10}[u], n_{00}[u], n_{11}[u])$ for some function $\bar G$ over $\bar D_3 = \{(a,b,c,d) \in \mathbb{Z}^4 \mid a, b, c, d \ge 0,\ a + b + c + d = 3\}$. Indeed, let $\Pi$ be the set of $3! = 6$ permutations of $V$. Any permutation $\pi \in \Pi$ defines a mapping $\psi_\pi : X \to X$ in a natural way. Define function $\bar g^\pi : X \to \mathbb{R}$ by $\bar g^\pi(u) = \bar g(\psi_\pi(u))$. Clearly, $\bar g^\pi$ is also a submodular relaxation extending $g$, and therefore so is the function $\bar g^* : X \to \mathbb{R}$ given by $\bar g^*(u) = \frac{1}{|\Pi|} \sum_{\pi \in \Pi} \bar g^\pi(u)$. Clearly, $\bar g^*(u)$ depends only on the counts $n_{01}[u], n_{10}[u], n_{00}[u], n_{11}[u]$.
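
The averaging step can be illustrated with a short Python sketch. It assumes that a labeling is stored as a tuple of pairs $(u_i, u_{i'})$, one per node of $V$, and that a permutation acts by reordering these pairs; `symmetrize`, `counts` and `g_example` are hypothetical names, and `g_example` is an arbitrary non-symmetric function rather than a relaxation from the paper.

```python
from fractions import Fraction
from itertools import permutations, product

def apply_permutation(perm, u):
    """Relabel nodes: position k of the result holds the pair of node perm[k]."""
    return tuple(u[p] for p in perm)

def symmetrize(g, n):
    """Return g*(u), the average of g over all n! node permutations of u."""
    perms = list(permutations(range(n)))

    def g_star(u):
        total = sum(Fraction(g(apply_permutation(p, u))) for p in perms)
        return total / len(perms)

    return g_star

def counts(u):
    """Counts (n01, n10, n00, n11) of the pairs (u_i, u_i') in labeling u."""
    return tuple(sum(1 for pair in u if pair == ab)
                 for ab in [(0, 1), (1, 0), (0, 0), (1, 1)])

def g_example(u):
    """Arbitrary integer-valued function that depends on the order of the nodes."""
    return sum((k + 1) * (2 * a + b) for k, (a, b) in enumerate(u))

n = 3
g_star = symmetrize(g_example, n)

# Group the values of g* by the count vector; every group has a single value.
values_by_counts = {}
for u in product([(0, 0), (0, 1), (1, 0), (1, 1)], repeat=n):
    values_by_counts.setdefault(counts(u), set()).add(g_star(u))

print(all(len(vals) == 1 for vals in values_by_counts.values()))  # True
```

Exact rational arithmetic via `Fraction` avoids rounding when dividing by $3! = 6$.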

If $c = 0$ or $d = 0$ for $(a,b,c,d) \in \bar D_3$ then $\bar G(a,b,c,d) = G(a,b)$. Thus, there are 4 unknown values: $\bar G(0,1,1,1)$, $\bar G(1,0,1,1)$, $\bar G(0,0,1,2)$, $\bar G(0,0,2,1)$. We will write labelings in $X$ as $\begin{pmatrix} u_i & u_j & u_k \\ u_{i'} & u_{j'} & u_{k'} \end{pmatrix}$ where $\{i,j,k\} = V$. From submodularity of $\bar g$, applied to pairs of labelings written in this form, we get inequalities such as

\[ \bar G(1,0,1,1) + 0 \le 1 + 0, \qquad 1 + 0 \le \bar G(1,0,1,1) + 0 \]

which implies $\bar G(0,1,1,1) = 0$, $\bar G(1,0,1,1) = 1$. Additional submodularity inequalities lead to an inconsistency:

\[ 1 + 0 \le \bar G(0,0,1,2) + 0, \qquad 0 + \bar G(0,0,1,2) \le 0 + 0. \]

□

It should be said that in this particular example $f$ has a submodular relaxation $\bar g$ with the same minimum as $g$ (the restriction of $\bar g$ to $X^-$ is shown in Figure 1(c)). Although $\bar g(u) < g(u)$ for some $u \in X^-$, both functions attain the minimum of $-1$ at $u = 0$. Using a computer implementation with an exact rational LP solver QSopt [1] we found other examples of functions $f$ with $n = 4$ variables which have tight bisubmodular relaxations $g$ (i.e. $g$ has a minimizer in $X^\circ$), but all submodular relaxations are not tight. One such example is shown in Figure 1(d); in this example, the minima of the tightest bisubmodular and submodular relaxations are $0$ and $-3/10$, respectively.






References 

[1] Asem M. Ali, Aly A. Farag, and Georgy L. Gimel'Farb. Optimizing binary MRFs with higher order 
cliques. In ECCV, 2008. 

[2] Kazutoshi Ando, Satoru Fujishige, and Takeshi Naitoh. A characterization of bisubmodular functions. 
Discrete Mathematics, 148:299-303, 1996. 

[3] M. L. Balinski. Integer programming: Methods, uses, computation. Management Science, 12(3):253- 
313, 1965. 

[4] E. Boros and P. L. Hammer. Pseudo-boolean optimization. Discrete Applied Mathematics,
123(1-3):155-225, November 2002.

[5] E. Boros, P. L. Hammer, and X. Sun. Network flows and minimization of quadratic pseudo-Boolean
functions. Technical Report RRR 17-1991, RUTCOR, May 1991.

[6] E. Boros, P. L. Hammer, and G. Tavares. Preprocessing of unconstrained quadratic binary optimization.
Technical Report RRR 10-2006, RUTCOR, 2006.

[7] A. Bouchet. Greedy algorithm and symmetric matroids. Math. Programming, 38:147-159, 1987. 

[8] A. Bouchet and W. H. Cunningham. Delta-matroids, jump systems and bisubmodular polyhedra.
SIAM J. Discrete Math., 8:17-32, 1995.

[9] Y. Boykov, O. Veksler, and R. Zabih. Fast approximate energy minimization via graph cuts. PAMI, 
23(11), November 2001. 

[10] R. Chandrasekaran and Santosh N. Kabadi. Pseudomatroids. Discrete Math., 71:205-217, 1988. 

[11] S. Fujishige. Submodular Functions and Optimization. North-Holland, 1991.

[12] Satoru Fujishige and Satoru Iwata. Bisubmodular function minimization. SIAM J. Discrete Math.,
19(4):1065-1073, 2006.

[13] P. L. Hammer, P. Hansen, and B. Simeone. Roof duality, complementation and persistency in quadratic 
0-1 optimization. Mathematical Programming, 28:121-155, 1984. 

[14] D. Hochbaum. Instant recognition of half integrality and 2-approximations. In 3rd International 
Workshop on Approximation Algorithms for Combinatorial Optimization, 1998. 

[15] D. Hochbaum. Solving integer programs over monotone inequalities in three variables: A framework 
for half integrality and good approximations. European Journal of Operational Research, 140(2):291- 
321, 2002. 

[16] H. Ishikawa. Higher-order clique reduction in binary graph cut. In CVPR, 2009. 

[17] H. Ishikawa. Higher-order gradient descent by fusion-move graph cut. In ICCV, 2009. 

[18] Satoru Iwata and Kiyohito Nagano. Submodular function minimization under covering constraints. 
In FOCS, October 2009. 

[19] Santosh N. Kabadi and R. Chandrasekaran. On totally dual integral systems. Discrete Appl. Math., 
26:87-104, 1990. 

[20] V. Kolmogorov. Minimizing a sum of submodular functions. Technical Report arXiv:1006.1990v1,
June 2010.




[21] V. Kolmogorov. Submodularity on a tree: Unifying L♮-convex and bisubmodular functions. Technical
Report arXiv:1007.1229v2, July 2010.

[22] Victor Lempitsky, Carsten Rother, and Andrew Blake. LogCut - efficient graph cut optimization for 
Markov random fields. In ICCV, 2007. 

[23] Victor Lempitsky, Carsten Rother, Stefan Roth, and Andrew Blake. Fusion moves for Markov random 
field optimization. PAMI, July 2009. 

[24] S. H. Lu and A. C. Williams. Roof duality for polynomial 0-1 optimization. Math. Programming, 
37(3):357-360, 1987. 

[25] S. Thomas McCormick and Satoru Fujishige. Strongly polynomial and fully combinatorial algorithms 
for bisubmodular function minimization. Math. Program., Ser. A, 122:87-120, 2010. 

[26] M. Nakamura. A characterization of greedy sets: universal polymatroids (I). In Scientific Papers of 
the College of Arts and Sciences, volume 38(2), pages 155-167. The University of Tokyo, 1998. 

[27] G. L. Nemhauser and L. E. Trotter. Vertex packings: Structural properties and algorithms.
Mathematical Programming, 8:232-248, 1975.

[28] Liqun Qi. Directed submodularity, ditroids and directed submodular flows. Mathematical
Programming, 42:579-599, 1988.

[29] A. Raj, G. Singh, and R. Zabih. MRF's for MRFs: Bayesian reconstruction of MR images via graph 
cuts. In CVPR, 2006. 

[30] Stefan Roth and Michael J. Black. Fields of experts. IJCV, 82(2):205-229, 2009. 

[31] C. Rother, V. Kolmogorov, V. Lempitsky, and M. Szummer. Optimizing binary MRFs via extended 
roof duality. In CVPR, June 2005. 

[32] O. Woodford, P. Torr, I. Reid, and A. Fitzgibbon. Global stereo reconstruction under second order 
smoothness priors. In CVPR, 2008. 






